{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "OlDwW4HY8MoU"
   },
   "source": [
    "# Introduction\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "2pFlZCUv7hM-"
   },
   "source": [
    "\n",
    "In this notebook, we will:\n",
    "- Learn how to train and evaluate a `BoostedTreesClassifier`\n",
    "- Explore how training can be sped up for small datasets\n",
    "- Develop intuition for how some of the hyperparameters affect the performance of boosted trees\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "QGP_iZh-1SX3"
   },
   "outputs": [],
   "source": [
    "# We will use NumPy and pandas for dealing with input data.\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "# And of course, we need tensorflow.\n",
    "import tensorflow as tf\n",
    "\n",
    "from distutils.version import StrictVersion"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'1.13.1'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "vfxkZE-MaY0h"
   },
   "source": [
    "# Load dataset\n",
    "We will be using the Titanic dataset, where the goal is to predict passenger survival given characteristics such as gender, age, class, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "gd995mWZzOTz"
   },
   "outputs": [],
   "source": [
    "tf.logging.set_verbosity(tf.logging.INFO)\n",
    "tf.set_random_seed(123)\n",
    "\n",
    "# Load dataset.\n",
    "dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')\n",
    "dfeval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')\n",
    "y_train = dftrain.pop('survived')\n",
    "y_eval = dfeval.pop('survived')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "HPs8YoHMkB7_"
   },
   "outputs": [],
   "source": [
    "fcol = tf.feature_column\n",
    "CATEGORICAL_COLUMNS = ['sex', 'n_siblings_spouses', 'parch', 'class', 'deck',\n",
    "                       'embark_town', 'alone']\n",
    "NUMERIC_COLUMNS = ['age', 'fare']\n",
    "\n",
    "def one_hot_cat_column(feature_name, vocab):\n",
    "  return fcol.indicator_column(\n",
    "      fcol.categorical_column_with_vocabulary_list(feature_name,\n",
    "                                                 vocab))\n",
    "fc = []\n",
    "for feature_name in CATEGORICAL_COLUMNS:\n",
    "  # Need to one-hot encode categorical features.\n",
    "  vocabulary = dftrain[feature_name].unique()\n",
    "  fc.append(one_hot_cat_column(feature_name, vocabulary))\n",
    "\n",
    "for feature_name in NUMERIC_COLUMNS:\n",
    "  fc.append(fcol.numeric_column(feature_name,\n",
    "                                dtype=tf.float32))"
   ]
  },
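  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the one-hot step above more concrete, here is a minimal NumPy sketch of what an indicator column over a vocabulary list conceptually produces. The helper `one_hot` and the toy values are illustrative assumptions, not part of the TensorFlow API, and the real feature columns build graph ops rather than arrays.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def one_hot(values, vocab):\n",
    "    # Hypothetical helper: map each value to a row with a 1 at its vocabulary index.\n",
    "    index = {v: i for i, v in enumerate(vocab)}\n",
    "    encoded = np.zeros((len(values), len(vocab)), dtype=np.float32)\n",
    "    for row, value in enumerate(values):\n",
    "        if value in index:  # out-of-vocabulary values stay all-zero in this sketch\n",
    "            encoded[row, index[value]] = 1.0\n",
    "    return encoded\n",
    "\n",
    "# Made-up values for illustration only.\n",
    "toy_sex = ['male', 'female', 'female']\n",
    "encoded_sex = one_hot(toy_sex, vocab=['male', 'female'])\n",
    "print(encoded_sex)\n",
    "```\n",
    "\n",
    "Each categorical feature expands into as many columns as it has vocabulary entries, which is why the vocabulary is taken from the unique values in the training data."
   ]
  },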
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "R_51OufwaY0o"
   },
   "outputs": [],
   "source": [
    "# Prepare the input fn. Use the entire dataset for a batch since this is such a small dataset.\n",
    "\n",
    "def make_input_fn(X, y, n_epochs=None, do_batching=True):\n",
    "  def input_fn():\n",
    "    BATCH_SIZE = len(y)\n",
    "    dataset = tf.data.Dataset.from_tensor_slices((X.to_dict(orient='list'), y))\n",
    "    # For training, cycle through the dataset as many times as needed (n_epochs=None).\n",
    "    dataset = dataset.repeat(n_epochs)\n",
    "    if do_batching:\n",
    "      dataset = dataset.batch(BATCH_SIZE)\n",
    "    return dataset\n",
    "  return input_fn"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "DMwL7qlrAdWk"
   },
   "source": [
    "# Training and Evaluating Classifiers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "pBhdfNzXjAsT"
   },
   "outputs": [],
   "source": [
    "TRAIN_SIZE = len(dftrain)\n",
    "params = {\n",
    "  'n_trees': 10,\n",
    "  'center_bias': False,\n",
    "  # Regularization is applied per instance, so if you are familiar with XGBoost,\n",
    "  # divide its regularization values by the number of examples per layer.\n",
    "  'l2_regularization': 1. / TRAIN_SIZE,\n",
    "}\n"
   ]
  },
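  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The comment on `l2_regularization` above says the value is applied per instance. As a rough sketch of that conversion, assuming an XGBoost-style regularization strength of 1.0 and a layer built from 1000 examples (the helper name `per_instance_l2` and both numbers are hypothetical):\n",
    "\n",
    "```python\n",
    "def per_instance_l2(xgb_lambda, examples_per_layer):\n",
    "    # Divide a whole-dataset regularization strength by the number of\n",
    "    # examples used to build each layer, as the comment above suggests.\n",
    "    if examples_per_layer <= 0:\n",
    "        raise ValueError('examples_per_layer must be positive')\n",
    "    return xgb_lambda / float(examples_per_layer)\n",
    "\n",
    "# Assumed values for illustration only.\n",
    "l2_value = per_instance_l2(xgb_lambda=1.0, examples_per_layer=1000)\n",
    "print(l2_value)\n",
    "```\n",
    "\n",
    "When each layer is built from a single batch holding the entire training set, the number of examples per layer is simply `TRAIN_SIZE`, which is consistent with the `1./TRAIN_SIZE` value used above."
   ]
  },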
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "hw4avA1R23dL"
   },
   "source": [
    "Train and evaluate the model. We will look at accuracy first.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 852
    },
    "colab_type": "code",
    "id": "GsMoeNiEHlox",
    "outputId": "ac640831-a46a-4c6a-b901-6663aa9f2ee9"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:tensorflow:Using default config.\n",
      "WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpaS1MMn\n",
      "INFO:tensorflow:Using config: {'_save_checkpoints_secs': 600, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5, '_task_type': 'worker', '_global_id_in_cluster': 0, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f3414de5c10>, '_model_dir': '/tmp/tmpaS1MMn', '_protocol': None, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_session_config': allow_soft_placement: true\n",
      "graph_options {\n",
      "  rewrite_options {\n",
      "    meta_optimizer_iterations: ONE\n",
      "  }\n",
      "}\n",
      ", '_tf_random_seed': None, '_save_summary_steps': 100, '_device_fn': None, '_experimental_distribute': None, '_num_worker_replicas': 1, '_task_id': 0, '_log_step_count_steps': 100, '_evaluation_master': '', '_eval_distribute': None, '_train_distribute': None, '_master': ''}\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow_estimator/python/estimator/canned/boosted_trees.py:256: _num_buckets (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.\n",
      "Instructions for updating:\n",
      "The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Colocations handled automatically by placer.\n",
      "INFO:tensorflow:Calling model_fn.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/feature_column/feature_column.py:2121: _transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.\n",
      "Instructions for updating:\n",
      "The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/feature_column/feature_column_v2.py:2703: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Use tf.cast instead.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/feature_column/feature_column.py:2121: _transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.\n",
      "Instructions for updating:\n",
      "The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/feature_column/feature_column_v2.py:4295: _get_sparse_tensors (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.\n",
      "Instructions for updating:\n",
      "The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/feature_column/feature_column.py:2121: _transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.\n",
      "Instructions for updating:\n",
      "The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/lookup_ops.py:1137: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Use tf.cast instead.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/feature_column/feature_column_v2.py:4266: _variable_shape (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed after 2018-11-30.\n",
      "Instructions for updating:\n",
      "The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow_estimator/python/estimator/canned/boosted_trees.py:117: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Use tf.cast instead.\n",
      "INFO:tensorflow:Done calling model_fn.\n",
      "INFO:tensorflow:Create CheckpointSaverHook.\n",
      "WARNING:tensorflow:Issue encountered when serializing resources.\n",
      "Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.\n",
      "'_Resource' object has no attribute 'name'\n",
      "INFO:tensorflow:Graph was finalized.\n",
      "INFO:tensorflow:Running local_init_op.\n",
      "INFO:tensorflow:Done running local_init_op.\n",
      "WARNING:tensorflow:Issue encountered when serializing resources.\n",
      "Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.\n",
      "'_Resource' object has no attribute 'name'\n",
      "INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpaS1MMn/model.ckpt.\n",
      "WARNING:tensorflow:Issue encountered when serializing resources.\n",
      "Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.\n",
      "'_Resource' object has no attribute 'name'\n",
      "INFO:tensorflow:loss = 0.6931468, step = 0\n",
      "INFO:tensorflow:Saving checkpoints for 60 into /tmp/tmpaS1MMn/model.ckpt.\n",
      "WARNING:tensorflow:Issue encountered when serializing resources.\n",
      "Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.\n",
      "'_Resource' object has no attribute 'name'\n",
      "INFO:tensorflow:Loss for final step: 0.30194622.\n",
      "INFO:tensorflow:Calling model_fn.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/metrics_impl.py:2002: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Deprecated in favor of operator or tf.math.divide.\n",
      "WARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to \"careful_interpolation\" instead.\n",
      "WARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to \"careful_interpolation\" instead.\n",
      "INFO:tensorflow:Done calling model_fn.\n",
      "INFO:tensorflow:Starting evaluation at 2019-06-18T19:53:16Z\n",
      "INFO:tensorflow:Graph was finalized.\n",
      "WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.\n",
      "Instructions for updating:\n",
      "Use standard file APIs to check for files with this prefix.\n",
      "INFO:tensorflow:Restoring parameters from /tmp/tmpaS1MMn/model.ckpt-60\n",
      "INFO:tensorflow:Running local_init_op.\n",
      "INFO:tensorflow:Done running local_init_op.\n",
      "INFO:tensorflow:Finished evaluation at 2019-06-18-19:53:17\n",
      "INFO:tensorflow:Saving dict for global step 60: accuracy = 0.8068182, accuracy_baseline = 0.625, auc = 0.8663299, auc_precision_recall = 0.85031575, average_loss = 0.41991314, global_step = 60, label/mean = 0.375, loss = 0.41991314, precision = 0.75, prediction/mean = 0.3852217, recall = 0.72727275\n",
      "WARNING:tensorflow:Issue encountered when serializing resources.\n",
      "Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.\n",
      "'_Resource' object has no attribute 'name'\n",
      "INFO:tensorflow:Saving 'checkpoint_path' summary for global step 60: /tmp/tmpaS1MMn/model.ckpt-60\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "accuracy                 0.806818\n",
       "accuracy_baseline        0.625000\n",
       "auc                      0.866330\n",
       "auc_precision_recall     0.850316\n",
       "average_loss             0.419913\n",
       "global_step             60.000000\n",
       "label/mean               0.375000\n",
       "loss                     0.419913\n",
       "precision                0.750000\n",
       "prediction/mean          0.385222\n",
       "recall                   0.727273\n",
       "dtype: float64"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Training and evaluation input functions.\n",
    "n_batches_per_layer = 1  # Use one batch, consisting of the entire dataset, to build each layer of the tree.\n",
    "DO_BATCHING = True\n",
    "\n",
    "train_input_fn = make_input_fn(dftrain, y_train, n_epochs=None, do_batching=DO_BATCHING)\n",
    "eval_input_fn = make_input_fn(dfeval, y_eval, n_epochs=1, do_batching=DO_BATCHING)\n",
    "est = tf.estimator.BoostedTreesClassifier(fc,\n",
    "                                          n_batches_per_layer=n_batches_per_layer,\n",
    "                                          **params)\n",
    "\n",
    "est.train(train_input_fn)\n",
    "\n",
    "# Eval.\n",
    "pd.Series(est.evaluate(eval_input_fn))"
   ]
  },
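  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The metrics reported above can be related to each other with a little arithmetic. The sketch below recomputes accuracy, precision, recall, and the baseline accuracy on made-up labels and predictions (the arrays and the helper `binary_metrics` are invented for illustration; only the formulas carry over). The baseline is the accuracy of always predicting the majority class, which is why a `label/mean` of 0.375 goes with an `accuracy_baseline` of 0.625.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def binary_metrics(labels, predictions):\n",
    "    # Assumes at least one positive label and one positive prediction,\n",
    "    # so neither denominator below is zero.\n",
    "    labels = np.asarray(labels)\n",
    "    predictions = np.asarray(predictions)\n",
    "    tp = np.sum((predictions == 1) & (labels == 1))\n",
    "    fp = np.sum((predictions == 1) & (labels == 0))\n",
    "    fn = np.sum((predictions == 0) & (labels == 1))\n",
    "    label_mean = labels.mean()\n",
    "    return {\n",
    "        'accuracy': float(np.mean(predictions == labels)),\n",
    "        'precision': float(tp) / (tp + fp),\n",
    "        'recall': float(tp) / (tp + fn),\n",
    "        'accuracy_baseline': float(max(label_mean, 1.0 - label_mean)),\n",
    "    }\n",
    "\n",
    "# Made-up data: 3 positives out of 8, so the label mean is 0.375.\n",
    "toy_labels = [1, 0, 0, 1, 0, 0, 1, 0]\n",
    "toy_predictions = [1, 0, 0, 0, 0, 1, 1, 0]\n",
    "metrics = binary_metrics(toy_labels, toy_predictions)\n",
    "print(metrics)\n",
    "```\n",
    "\n",
    "A model is only useful if its accuracy clearly exceeds the baseline, so the two numbers are best read together."
   ]
  },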
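  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For comparison, train a `DNNClassifier` with a single hidden layer of 10 units on the same feature columns and evaluate it on the same data."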
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:tensorflow:Using default config.\n",
      "WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpQ_vHbC\n",
      "INFO:tensorflow:Using config: {'_save_checkpoints_secs': 600, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5, '_task_type': 'worker', '_global_id_in_cluster': 0, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f34437d2610>, '_model_dir': '/tmp/tmpQ_vHbC', '_protocol': None, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_session_config': allow_soft_placement: true\n",
      "graph_options {\n",
      "  rewrite_options {\n",
      "    meta_optimizer_iterations: ONE\n",
      "  }\n",
      "}\n",
      ", '_tf_random_seed': None, '_save_summary_steps': 100, '_device_fn': None, '_experimental_distribute': None, '_num_worker_replicas': 1, '_task_id': 0, '_log_step_count_steps': 100, '_evaluation_master': '', '_eval_distribute': None, '_train_distribute': None, '_master': ''}\n",
      "INFO:tensorflow:Calling model_fn.\n",
      "INFO:tensorflow:Done calling model_fn.\n",
      "INFO:tensorflow:Create CheckpointSaverHook.\n",
      "INFO:tensorflow:Graph was finalized.\n",
      "INFO:tensorflow:Running local_init_op.\n",
      "INFO:tensorflow:Done running local_init_op.\n",
      "INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpQ_vHbC/model.ckpt.\n",
      "INFO:tensorflow:loss = 1149.9677, step = 1\n",
      "INFO:tensorflow:global_step/sec: 60.879\n",
      "INFO:tensorflow:loss = 270.0607, step = 101 (1.646 sec)\n",
      "INFO:tensorflow:global_step/sec: 60.3321\n",
      "INFO:tensorflow:loss = 250.54703, step = 201 (1.659 sec)\n",
      "INFO:tensorflow:global_step/sec: 56.3728\n",
      "INFO:tensorflow:loss = 244.14874, step = 301 (1.772 sec)\n",
      "INFO:tensorflow:global_step/sec: 58.668\n",
      "INFO:tensorflow:loss = 239.70654, step = 401 (1.706 sec)\n",
      "INFO:tensorflow:global_step/sec: 58.1356\n",
      "INFO:tensorflow:loss = 237.23935, step = 501 (1.719 sec)\n",
      "INFO:tensorflow:global_step/sec: 51.4254\n",
      "INFO:tensorflow:loss = 235.0013, step = 601 (1.949 sec)\n",
      "INFO:tensorflow:global_step/sec: 50.1686\n",
      "INFO:tensorflow:loss = 233.78299, step = 701 (1.989 sec)\n",
      "INFO:tensorflow:global_step/sec: 62.6966\n",
      "INFO:tensorflow:loss = 232.37886, step = 801 (1.593 sec)\n",
      "INFO:tensorflow:global_step/sec: 54.9425\n",
      "INFO:tensorflow:loss = 231.2874, step = 901 (1.822 sec)\n",
      "INFO:tensorflow:Saving checkpoints for 1000 into /tmp/tmpQ_vHbC/model.ckpt.\n",
      "INFO:tensorflow:Loss for final step: 230.49117.\n",
      "INFO:tensorflow:Calling model_fn.\n",
      "WARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to \"careful_interpolation\" instead.\n",
      "WARNING:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to \"careful_interpolation\" instead.\n",
      "INFO:tensorflow:Done calling model_fn.\n",
      "INFO:tensorflow:Starting evaluation at 2019-06-18T19:54:06Z\n",
      "INFO:tensorflow:Graph was finalized.\n",
      "INFO:tensorflow:Restoring parameters from /tmp/tmpQ_vHbC/model.ckpt-1000\n",
      "INFO:tensorflow:Running local_init_op.\n",
      "INFO:tensorflow:Done running local_init_op.\n",
      "INFO:tensorflow:Finished evaluation at 2019-06-18-19:54:07\n",
      "INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.81439394, accuracy_baseline = 0.625, auc = 0.8449648, auc_precision_recall = 0.79683095, average_loss = 0.48387054, global_step = 1000, label/mean = 0.375, loss = 127.74182, precision = 0.74038464, prediction/mean = 0.40750384, recall = 0.7777778\n",
      "INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmpQ_vHbC/model.ckpt-1000\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "accuracy                   0.814394\n",
       "accuracy_baseline          0.625000\n",
       "auc                        0.844965\n",
       "auc_precision_recall       0.796831\n",
       "average_loss               0.483871\n",
       "global_step             1000.000000\n",
       "label/mean                 0.375000\n",
       "loss                     127.741821\n",
       "precision                  0.740385\n",
       "prediction/mean            0.407504\n",
       "recall                     0.777778\n",
       "dtype: float64"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Train a DNN with one hidden layer of 10 units, reusing the input functions defined above.\n",
    "est = tf.estimator.DNNClassifier([10], fc)\n",
    "est.train(train_input_fn, max_steps=1000)\n",
    "pd.Series(est.evaluate(eval_input_fn))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "QemELZhyBZYc"
   },
   "source": [
    "## Improving the performance\n",
    "\n",
    "**???** Can you get better performance out of the classifier? Which parameters are boosted trees most sensitive to?\n",
    "\n",
    "**???** Can boosted trees overfit? How can you demonstrate it?\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "fDNzuC0xUstP"
   },
   "source": [
    "# Results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "d_iUYaTq2ZgL"
   },
   "source": [
    "Let's understand how our model is performing by looking at the distribution of its predicted survival probabilities on the evaluation set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "colab": {
     "height": 289
    },
    "colab_type": "code",
    "id": "kgds_rmq2_2t",
    "outputId": "931df046-e7fe-4e7e-9680-5106a40265ad"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:tensorflow:Calling model_fn.\n",
      "INFO:tensorflow:Done calling model_fn.\n",
      "INFO:tensorflow:Graph was finalized.\n",
      "INFO:tensorflow:Restoring parameters from /tmp/tmphU4KBq/model.ckpt-1000\n",
      "INFO:tensorflow:Running local_init_op.\n",
      "INFO:tensorflow:Done running local_init_op.\n"
     ]
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEICAYAAABS0fM3AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi40LCBodHRwOi8vbWF0cGxvdGxpYi5vcmcv7US4rQAAF35JREFUeJzt3XuYZHV95/H3BwYEFJVLO0EujhdEWY1ARpTH9YooEQXWGMRHksEQJyauxtXNisaNxuiz+GwiapJVx+toVEC8MF4jjhDiRsRB8ALIglyH27QKImhE4Lt/1Omk0nZPn+7pUzU95/16nnrq3M/3Vz1Tnzq/c+pUqgpJUn9tN+4CJEnjZRBIUs8ZBJLUcwaBJPWcQSBJPWcQSFLPGQTa6iT5cJK3NMNPTnL5iPZbSR4xon1dk+SZC1z33CR/OMu8/ZLckWT76csmeXGSr2xmuyN7rbV1MQi0Vauqf66qA+ZaLsmJSb4+ipq2ZlV1XVXdr6rumWHex6rqWVPj04Ov7WutbY9BoE4lWTbuGkatj23W0mYQaN6abo3XJbk0ya1JPpRkp2be05JsTPLaJDcDH2qmPzfJxUluS/IvSX5zaHsHJ/l2kp8lOR3YaWje05JsHBrfN8mnk0wm+XGSv0vyaOA9wGFNt8htzbL3SfLXSa5LckuS9yTZeWhbf5bkpiQ3JvmDOdp8bpL/leSCJLcnOSvJ7s28Fc2n65OSXAd8rZl+dJJLmjaf29Q57PGzvIa7Jfl808Zbm+F9pq378Dlq+bUwGj5qSnJeM/k7zWv2whle6wcn+VRTx9VJXjk079AkG5r935Lk7Zt7/bR1Mwi0UC8Gng08HHgk8Iaheb8B7A48BFid5GDgg8AfAXsA7wXWNW/UOwKfBT7arPNJ4Hdm2mHT7/154FpgBbA3cFpVXQa8DPhG0y3ywGaVU5raDgIe0Sz/F822jgT+O3AEsD/Qpr/+94E/APYC7gbeNW3+U4FHA89O8kjgE8CrgAngi8DnmvZOme013I5BgD4E2A/4BfB386xls6rqKc3g45rX7PTh+Um2Az4HfIfB63Y48Kokz24WeSfwzqq6f1P/GfPZv7YyVeXDx7wewDXAy4bGnwP8sBl+GnAXsNPQ/HcDfzVtG5czeON8CnAjkKF5/wK8ZWh7G5vhw4BJYNkMNZ0IfH1oPMCdwMOHph0GXN0MfxA4ZWjeI4ECHjFLm8+dtvyBTTu3ZxBKBTxsaP7/BM4YGt8OuAF42lyv4Qz7Pgi4dZ61LBta9g9neY3+Q3unvdZPAK6bVsfrgA81w+cBfwnsOe5/jz62/GFfphbq+qHha4EHD41PVtW/Do0/BFiV5BVD03Zs1inghmreXYa2N5N9gWur6u4W9U0AuwAXJpmaFgZvljT7vrDFPodNb/MOwJ6zzH/w8Dar6t4k1zP4dD3b9h4MkGQX4FTgSGC3Zv6uSbavfz8JPFctW+ohwIOnutka2wP/3AyfBLwZ+EGSq4G/rKrPL+L+NUIGgRZq36Hh/Rh8qp8y/Za21wNvraq3Tt9IkqcCeyfJUBjsB/xwhn1eD+yXZNkMYTB9nz9i0KXyn6rqhhm2ddMMbZjL9OV/1exnavpwDTcCj50aySCN9mVwVDDb9qZew9cABwBPqKqbkxwEXMQgyNrWsqWuZ3D0tP9MM6vqCuBFTRfS84Ezk+xRVXcu0v41Qp4j0EK9PMk+zUnKPwdO38yy7wNeluQJGbhvkqOS7Ap8g0Ef9yuT7JDk+cChs2znAgZv4Kc029gpyZOaebcA+0z1wVfVvc1+T03yIIAkew/1cZ8BnJjkwOYT+BtbtPmEoeXfDJxZM1ymObT9o5IcnmQHBm/uv2TQ7TVlttdwVwYhdlszb6ba5lPLbG4BHjbLvAuAn2Vw0n/nJNsneUySxwMkOSHJRPM6Tx013DvP/WsrYRBooT4OfAW4isGn97fMtmBVbQBe
yuCE563AlQz6q6mquxh8ojwR+AnwQuDTs2znHuB5DE78XgdsbJaHwZU6lwA3J/lRM+21zb7OT3I78FUGn7Spqi8B72jWu7J5nstHgQ8DNzO4sumVsy1YVZcDJwB/y+CT+vOA5zXtnTLba/gOYOdmvfOBL29JLZvxJmBtc1XTcdPqvwd4LoPzE1c3tbwfeECzyJHAJUnuYHDi+Piq+sUCatBWIP+xa1aaW5JrGJyA/Oq4axmVJOcC/1BV7x93LdJi84hAknrOIJCknrNrSJJ6ziMCSeq5JfE9gj333LNWrFgx7jIkaUm58MILf1RVE3MttySCYMWKFWzYsGHcZUjSkpKkzTfm7RqSpL4zCCSp5wwCSeo5g0CSes4gkKSeMwgkqecMAknqOYNAknrOIJCknlsS3ywelxUnf2HB615zylGLWIkkdccjAknqOYNAknrOIJCknjMIJKnnDAJJ6jmDQJJ6ziCQpJ7rLAiSHJDk4qHH7UlelWT3JGcnuaJ53q2rGiRJc+ssCKrq8qo6qKoOAn4L+DnwGeBkYH1V7Q+sb8YlSWMyqq6hw4EfVtW1wDHA2mb6WuDYEdUgSZrBqILgeOATzfDyqrqpGb4ZWD7TCklWJ9mQZMPk5OQoapSkXuo8CJLsCBwNfHL6vKoqoGZar6rWVNXKqlo5MTHRcZWS1F+jOCL4beDbVXVLM35Lkr0AmudNI6hBkjSLUQTBi/j3biGAdcCqZngVcNYIapAkzaLTIEhyX+AI4NNDk08BjkhyBfDMZlySNCad/h5BVd0J7DFt2o8ZXEUkSdoK+M1iSeo5g0CSes4gkKSeMwgkqecMAknqOYNAknrOIJCknjMIJKnnDAJJ6jmDQJJ6ziCQpJ4zCCSp5wwCSeo5g0CSes4gkKSeMwgkqecMAknqOYNAknrOIJCknuv6x+sfmOTMJD9IclmSw5LsnuTsJFc0z7t1WYMkafO6PiJ4J/DlqnoU8DjgMuBkYH1V7Q+sb8YlSWPSWRAkeQDwFOADAFV1V1XdBhwDrG0WWwsc21UNkqS5dXlE8FBgEvhQkouSvD/JfYHlVXVTs8zNwPKZVk6yOsmGJBsmJyc7LFOS+q3LIFgGHAK8u6oOBu5kWjdQVRVQM61cVWuqamVVrZyYmOiwTEnqty6DYCOwsaq+2YyfySAYbkmyF0DzvKnDGiRJc+gsCKrqZuD6JAc0kw4HLgXWAauaaauAs7qqQZI0t2Udb/8VwMeS7AhcBbyEQfickeQk4FrguI5rkCRtRqdBUFUXAytnmHV4l/uVJLXnN4slqecMAknqOYNAknrOIJCknjMIJKnnDAJJ6jmDQJJ6ziCQpJ4zCCSp5wwCSeo5g0CSes4gkKSeMwgkqecMAknqOYNAknrOIJCknjMIJKnnDAJJ6jmDQJJ6rtPfLE5yDfAz4B7g7qpamWR34HRgBXANcFxV3dplHZKk2Y3iiODpVXVQVU39iP3JwPqq2h9Y34xLksZkHF1DxwBrm+G1wLFjqEGS1Og6CAr4SpILk6xupi2vqpua4ZuB5TOtmGR1kg1JNkxOTnZcpiT1V6fnCID/XFU3JHkQcHaSHwzPrKpKUjOtWFVrgDUAK1eunHEZSdKW6/SIoKpuaJ43AZ8BDgVuSbIXQPO8qcsaJEmb11kQJLlvkl2nhoFnAd8H1gGrmsVWAWd1VYMkaW5ddg0tBz6TZGo/H6+qLyf5FnBGkpOAa4HjOqxBkjSHzoKgqq4CHjfD9B8Dh3e1X0nS/PjNYknquVZBkOSxXRciSRqPtkcE/yfJBUn+JMkDOq1IkjRSrYKgqp4MvBjYF7gwyceTHNFpZZKkkWh9jqCqrgDeALwWeCrwriQ/SPL8roqTJHWv7TmC30xyKnAZ8AzgeVX16Gb41A7rkyR1rO3lo38LvB94fVX9YmpiVd2Y5A2dVCZJGom2QXAU8IuqugcgyXbATlX186r6aGfVSZI61/YcwVeBnYfGd2mmSZKWuLZBsFNV3TE1
0gzv0k1JkqRRahsEdyY5ZGokyW8Bv9jM8pKkJaLtOYJXAZ9MciMQ4DeAF3ZWlSRpZFoFQVV9K8mjgAOaSZdX1a+6K0uSNCrzufvo44EVzTqHJKGqPtJJVZKkkWkVBEk+CjwcuBi4p5lcgEEgSUtc2yOClcCBVeVvB0vSNqbtVUPfZ3CCWJK0jWl7RLAncGmSC4BfTk2sqqM7qUqSNDJtg+BNXRYhSRqftr9H8E/ANcAOzfC3gG+3WTfJ9kkuSvL5ZvyhSb6Z5MokpyfZcYG1S5IWQdvbUL8UOBN4bzNpb+CzLffxpwxuXz3lbcCpVfUI4FbgpJbbkSR1oO3J4pcDTwJuh3/7kZoHzbVSkn0Y3Ln0/c14GPyGwZnNImuBY+dXsiRpMbUNgl9W1V1TI0mWMfgewVzeAfwP4N5mfA/gtqq6uxnfyODo4tckWZ1kQ5INk5OTLcuUJM1X2yD4pySvB3Zufqv4k8DnNrdCkucCm6rqwoUUVlVrqmplVa2cmJhYyCYkSS20vWroZAZ9+d8D/gj4Ik13z2Y8CTg6yXOAnYD7A+8EHphkWXNUsA9ww0IKlyQtjrZXDd1bVe+rqt+tqhc0w5vtGqqq11XVPlW1Ajge+FpVvRg4B3hBs9gq4KwtqF+StIXa3mvoamY4J1BVD1vAPl8LnJbkLcBFwAcWsA1J0iKZz72GpuwE/C6we9udVNW5wLnN8FXAoW3XlSR1q23X0I+HHjdU1TsYXBYqSVri2nYNHTI0uh2DI4T5/JaBJGkr1fbN/G+Ghu9mcLuJ4xa9GknSyLX9qcqnd12IJGk82nYNvXpz86vq7YtTjiRp1OZz1dDjgXXN+POAC4AruihKkjQ6bYNgH+CQqvoZQJI3AV+oqhO6KkySNBpt7zW0HLhraPyuZpokaYlre0TwEeCCJJ9pxo9lcAtpSdIS1/aqobcm+RLw5GbSS6rqou7KkiSNStuuIYBdgNur6p3AxiQP7agmSdIItf2pyjcyuFnc65pJOwD/0FVRkqTRaXtE8F+Ao4E7AarqRmDXroqSJI1O2yC4q/n9gQJIct/uSpIkjVLbIDgjyXsZ/LrYS4GvAu/rrixJ0qi0vWror5vfKr4dOAD4i6o6u9PKJEkjMWcQJNke+Gpz4znf/CVpGzNn11BV3QPcm+QBI6hHkjRibb9ZfAfwvSRn01w5BFBVr+ykKknSyLQNgk83j9aS7AScB9yn2c+ZVfXG5otopwF7ABcCv1dVd82+JUlSlzYbBEn2q6rrqmoh9xX6JfCMqrojyQ7A15vbVLwaOLWqTkvyHuAk4N0L2L4kaRHMdY7gs1MDST41nw3XwB3N6A7No4BnAGc209cyuIGdJGlM5gqCDA0/bL4bT7J9kouBTQyuOPohcFtV3d0sshHYe5Z1VyfZkGTD5OTkfHctSWppriCoWYZbqap7quogBj9scyjwqHmsu6aqVlbVyomJifnuWpLU0lwnix+X5HYGRwY7N8M041VV92+zk6q6Lck5wGEMvp28rDkq2Ae4YYG1S5IWwWaPCKpq+6q6f1XtWlXLmuGp8c2GQJKJJA9shncGjgAuA84BXtAstgo4a8ubIUlaqLaXjy7EXsDa5pvJ2wFnVNXnk1wKnJbkLcBFwAc6rEGSNIfOgqCqvgscPMP0qxicL9imrTj5C1u0/jWnHLVIlUjS5s3nF8okSdsgg0CSes4gkKSeMwgkqecMAknquS4vH5W0DdqSK+K8Gm7r5BGBJPWcQSBJPWcQSFLPGQSS1HMGgST1nEEgST1nEEhSzxkEktRzBoEk9ZxBIEk9ZxBIUs8ZBJLUcwaBJPVcZ0GQZN8k5yS5NMklSf60mb57krOTXNE879ZVDZKkuXV5RHA38JqqOhB4IvDyJAcCJwPrq2p/YH0zLkkak86CoKpuqqpvN8M/Ay4D9gaOAdY2i60Fju2qBknS3EZyjiDJCuBg4JvA8qq6qZl1M7B8lnVWJ9mQZMPk5OQoypSkXuo8CJLc
D/gU8Kqqun14XlUVUDOtV1VrqmplVa2cmJjoukxJ6q1OgyDJDgxC4GNV9elm8i1J9mrm7wVs6rIGSdLmdXnVUIAPAJdV1duHZq0DVjXDq4CzuqpBkjS3Ln+8/knA7wHfS3JxM+31wCnAGUlOAq4FjuuwBknSHDoLgqr6OpBZZh/e1X6nW3HyF0a1K0n6NVvyHnTNKUctYiWz85vFktRzBoEk9ZxBIEk9ZxBIUs8ZBJLUcwaBJPWcQSBJPWcQSFLPGQSS1HMGgST1nEEgST1nEEhSzxkEktRzXd6GWltgKdyxUBo1/190wyMCSeo5g0CSes4gkKSeMwgkqecMAknquc6CIMkHk2xK8v2habsnOTvJFc3zbl3tX5LUTpdHBB8Gjpw27WRgfVXtD6xvxiVJY9RZEFTVecBPpk0+BljbDK8Fju1q/5Kkdkb9hbLlVXVTM3wzsHy2BZOsBlYD7LfffiMoTeqPLfli1lLll9FmN7aTxVVVQG1m/pqqWllVKycmJkZYmST1y6iD4JYkewE0z5tGvH9J0jSjDoJ1wKpmeBVw1oj3L0mapsvLRz8BfAM4IMnGJCcBpwBHJLkCeGYzLkkao85OFlfVi2aZdXhX+5S0devjSeqlwG8WS1LPGQSS1HP+MI20Bbw2vR+29S4tjwgkqecMAknqOYNAknrOIJCknjMIJKnnvGpIi2ZLr6zYkqtovHpHWjiPCCSp5wwCSeo5u4a2Qdv6l18W27heL/9O2lp4RCBJPWcQSFLPGQSS1HMGgST1nCeLtdXw5Kk0Hh4RSFLPGQSS1HNjCYIkRya5PMmVSU4eRw2SpIGRB0GS7YG/B34bOBB4UZIDR12HJGlgHEcEhwJXVtVVVXUXcBpwzBjqkCQxnquG9gauHxrfCDxh+kJJVgOrm9E7klw+y/b2BH60qBUuLX1uv23vp960PW/7tUnzbftD2iy01V4+WlVrgDVzLZdkQ1WtHEFJW6U+t9+22/a+6art4+gaugHYd2h8n2aaJGkMxhEE3wL2T/LQJDsCxwPrxlCHJIkxdA1V1d1J/ivwj8D2wAer6pIt2OSc3UfbuD6337b3k21fZKmqLrYrSVoi/GaxJPWcQSBJPbdkgmCu21IkuU+S05v530yyYvRVdqNF21+d5NIk302yPkmra4eXira3JEnyO0kqyTZzaWGbtic5rvn7X5Lk46OusSst/t3vl+ScJBc1//afM446u5Dkg0k2Jfn+LPOT5F3Na/PdJIds0Q6raqt/MDip/EPgYcCOwHeAA6ct8yfAe5rh44HTx133CNv+dGCXZviPt5W2t21/s9yuwHnA+cDKcdc9wr/9/sBFwG7N+IPGXfcI274G+ONm+EDgmnHXvYjtfwpwCPD9WeY/B/gSEOCJwDe3ZH9L5YigzW0pjgHWNsNnAocnyQhr7Mqcba+qc6rq583o+Qy+m7GtaHtLkr8C3gb86yiL61ibtr8U+PuquhWgqjaNuMautGl7Afdvhh8A3DjC+jpVVecBP9nMIscAH6mB84EHJtlroftbKkEw020p9p5tmaq6G/gpsMdIqutWm7YPO4nBJ4VtxZztbw6L962qbe2Xbdr87R8JPDLJ/01yfpIjR1Zdt9q0/U3ACUk2Al8EXjGa0rYK831f2Kyt9hYTmr8kJwArgaeOu5ZRSbId8HbgxDGXMi7LGHQPPY3BkeB5SR5bVbeNtarReBHw4ar6mySHAR9N8piqunfchS01S+WIoM1tKf5tmSTLGBwq/ngk1XWr1S05kjwT+HPg6Kr65YhqG4W52r8r8Bjg3CTXMOgvXbeNnDBu87ffCKyrql9V1dXA/2MQDEtdm7afBJwBUFXfAHZicFO2PljUW/UslSBoc1uKdcCqZvgFwNeqOauyxM3Z9iQHA+9lEALbSh/xlM22v6p+WlV7VtWKqlrB4BzJ0VW1YTzlLqo2/+4/y+BogCR7MugqumqURXakTduvAw4HSPJoBkEwOdIqx2cd8PvN1UNP
BH5aVTctdGNLomuoZrktRZI3Axuqah3wAQaHhlcyOMly/PgqXjwt2/6/gfsBn2zOj19XVUePrehF1LL926SWbf9H4FlJLgXuAf6sqpb8kXDLtr8GeF+S/8bgxPGJ28iHP5J8gkHA79mcA3kjsANAVb2HwTmR5wBXAj8HXrJF+9tGXjdJ0gItla4hSVJHDAJJ6jmDQJJ6ziCQpJ4zCCSp5wwCSeo5g0CSeu7/AxCMfRLOcHkVAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "pred_dicts = list(est.predict(eval_input_fn))\n",
    "probs = pd.Series([pred['probabilities'][1] for pred in pred_dicts])\n",
    "\n",
    "probs.plot(kind='hist', bins=20, title='predicted probabilities');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "jq-CPquY-bG3"
   },
   "source": [
    "**???** Why are the probabilities right-skewed?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "qrvEyh4Q3YgC"
   },
   "source": [
    "Let's plot an ROC curve to understand model performance at various prediction probability thresholds."
   ]
  },
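  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of what an ROC curve measures, the snippet below sweeps a decision threshold over made-up scores and records the false positive rate and true positive rate at each threshold. The function name `roc_points` and the toy data are assumptions for illustration; a library routine such as scikit-learn's `roc_curve` would normally be used instead.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def roc_points(labels, scores, thresholds):\n",
    "    # Returns one (false positive rate, true positive rate) pair per threshold.\n",
    "    labels = np.asarray(labels)\n",
    "    scores = np.asarray(scores)\n",
    "    n_pos = np.sum(labels == 1)\n",
    "    n_neg = np.sum(labels == 0)\n",
    "    points = []\n",
    "    for t in thresholds:\n",
    "        predicted_pos = scores >= t\n",
    "        tpr = np.sum(predicted_pos & (labels == 1)) / float(n_pos)\n",
    "        fpr = np.sum(predicted_pos & (labels == 0)) / float(n_neg)\n",
    "        points.append((float(fpr), float(tpr)))\n",
    "    return points\n",
    "\n",
    "# Made-up labels and predicted probabilities for illustration only.\n",
    "toy_labels = [0, 0, 1, 1]\n",
    "toy_scores = [0.1, 0.4, 0.35, 0.8]\n",
    "roc = roc_points(toy_labels, toy_scores, thresholds=[0.9, 0.5, 0.3, 0.0])\n",
    "print(roc)\n",
    "```\n",
    "\n",
    "Lowering the threshold moves the operating point from (0, 0) toward (1, 1); a better classifier bends the curve toward the top-left corner."
   ]
  },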
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "colab": {
     "height": 307
    },
    "colab_type": "code",
    "id": "ByhMg-_a3K_q",
    "outputId": "f02d1b06-cc0a-43ac-b95e-3e9192a59099"
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEWCAYAAACJ0YulAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi40LCBodHRwOi8vbWF0cGxvdGxpYi5vcmcv7US4rQAAIABJREFUeJzt3XmYHVWd//H3pzvp7BskIZAQkkAghkXAsLoQVgOyjoqgyDCD8qiguKEwOIioP8ZhYBRlhDhiFEU2R40aQUUgv4kEkrBEkoCGJJCEhOxbZ+sk3/mjKpeb5nZ3hXTd233783qefrrq1Kmqb3X3c7996lSdo4jAzMwMoKbSAZiZWdvhpGBmZgVOCmZmVuCkYGZmBU4KZmZW4KRgZmYFTgpmZlbgpGBVR9ICSZskbZC0VNIEST0b1TlR0p8lrZe0VtJvJI1uVKe3pG9LejU91svpev/yXpFZ+TgpWLU6JyJ6AkcCRwHX7dwg6QTgD8Cvgf2A4cDzwBRJI9I6dcCjwKHAOKA3cAKwEjg2r6Aldcrr2GZZOClYVYuIpcAjJMlhp38HfhIR34mI9RGxKiK+AkwFbkzrXAoMBS6IiNkRsSMilkXE1yNiUqlzSTpU0h8lrZL0uqR/ScsnSPpGUb2xkhYVrS+Q9GVJM4H6dPmhRsf+jqTb0+U+kn4oaYmkxZK+Ial2D39UZoCTglU5SUOAM4G56Xp34ETgwRLVHwBOT5dPAx6OiA0Zz9ML+BPwMEnr4yCSlkZWFwPvA/oC9wFnpcck/cC/ELg3rTsB2Jae4yjgDOBju3EusyY5KVi1+pWk9cBCYBnw1bR8L5K/+yUl9lkC7Owv2LuJOk05G1gaEbdGxOa0BfLUbux/e0QsjIhNEfEK8AxwQbrtFGBjREyVtA9wFvDZiKiPiGXAfwIX7ca5zJrkpGDV6vyI6AWMBUbxxof9amAHsG+JffYFVqTLK5uo05T9gZffUqSJhY3W7yVpPQB8mDdaCQcAnYElktZIWgPcBQzcg3ObFTgpWFWLiCdIbrf8R7peDzwJfLBE9Qt545bPn4D3SuqR8VQLgRFNbKsHuhetDyoVaqP1B4Gx6e2vC3gjKSwEtgD9I6Jv+tU7Ig7NGKdZs5wUrCP4NnC6pLen69cC/yjpM5J6SeqXdgSfAHwtrXMPyQfwLySNklQjaW9J/yLprBLn+C2wr6TPSuqSHve4dNtzJH0Ee0kaBHy2pYAjYjnwOPAjYH5EzEnLl5A8OXVr+shsjaQDJZ30Fn4uZm/ipGBVL/2A/QlwQ7r+v8B7gX8g6Td4haTD9l0R8fe0zhaSzuYXgT8C64CnSW5DvamvICLWk3RSnwMsBf4OnJxuvofkkdcFJB/o92cM/d40hnsblV8K1AGzSW6HPcTu3eoya5I8yY6Zme3kloKZmRU4KZiZWYGTgpmZFTgpmJlZQbsbfKt///4xbNiwSodhZtauzJgxY0VEDGipXrtLCsOGDWP69OmVDsPMrF2R9EqWer59ZGZmBU4KZmZW4KRgZmYFTgpmZlbgpGBmZgVOCmZmVpBbUpB0t6Rlkl5oYrsk3S5prqSZko7OKxYzM8smz/cUJgDfIxmyuJQzgZHp13HA99PvZmYVtblhe6VDeBMJunSqzf08uSWFiJgsaVgzVc4DfhLJ2N1TJfWVtG86iYiZWUX87KlXuP6XJW9wVNQZo/dh/KVjcj9PJd9oHsyu89IuSsvelBQkXQFcATB06NCyBGdmHdOrqzZSWyO+eMYhlQ5lF8P7d2+5UitoF8NcRMR4YDzAmDFjPCuQmbW6R+e8zucfeJ6NW7dRV1vDJ8ceWOmQKqKSSWExsH/R+pC0zMys7F56fT1rNzVw2YnDGL1f70qHUzGVTAoTgask3UfSwbzW/Qlmlpe1Gxu49O6nWLd5W8ntazZuBeDaM0fRtXP+HbptVW5JQdLPgbFAf0mLgK8CnQEi
4k5gEnAWMBfYCPxTXrGYmS1cvZHnF63l2GF7MahP15J1hu3dvUMnBMj36aOLW9gewJV5nd/Mqtf9017loRmLdmuf+i3JY6Yff88ITh+9Tx5hVYV20dFsZlbstzOXMGfJeo4Y0ifzPn2713DqqIG7tU9H5KRgZmW3uWE7X/vNLNZuanhL+89Zso6R+/Tk3o8f38qRmZOCmZXdvOX1/Pzphezbpys9u+z+x1C/7nWccsjAHCIzJwUzK4sfTJ7HwtUbAVhVnzzp89VzDmXcYYMqGZY14qRgZrmr37KNb06aQ7fOtXTtnIzDuW+frowY0KPCkVljTgpm1qqmzF3BM6+s3qVs6/YdAHz+9IP5+HtGVCIsy8hJwcxa1VcnzmLusg1vKq8RDN27POP32FvnpGDWwUUED7+wlDVv8UmgxtZs3Mr7jtiX73zoyF3KJVFbo1Y5h+XHScGsg5u/op5P/uyZVj3moN5d6VTriR3bIycFszKbv6Kel0vcXqmUxWs2AfDNCw7j1FGt86bvwF5dWuU4Vn5OCmZldvmEacxbUV/pMN7kgL16NDkmkHUcTgpmZbJ2YwPzV9azdlMDp71tH64+dWSlQyro2rmGgwb2rHQY1gY4KZiVwbzlG7jwridZsSF5aWtIv24c7jF4rA1yUjBrZTt2BEvWbS6sr9vUwMd+PJ0I+K+PHE23ulqOHtqvghGaNc1JwayV3fKHl/j+4y/vUtazSyfuu+J4Dhvs1oG1bU4KZm9Bw/Yd7IjS04W/vm4zfbp15vqz3lYoe8ewfhw4wPfsre1zUjDbTTNeWcVF46fSsL10UgDYf69uXHjM/k1uN2urnBTMdtOi1Zto2B5c/q7h7NWjrmQdT+Ri7ZWTgrVrs15by2U/msaWhu1lO+fOFsJHjhvKCN8SsirjpGDt2vwV9Sxfv4Xzj9yPvt1L/9eeh7161DFsbw/7bNXHScHahc0N27n0h0+zon7LLuUbNm8D4MqTD2LkPr0qEZpZVXFSsHZh2botPL1gFW/fvy/79+u2y7a+3TszrL//azdrDU4KVtIdj81l8t+WVzqMgs3bkklaLj3+AN7/jiEVjsasenlsWyvpFzMWlZwopVK6dqrh3SP7c/QBfhPYLE9uKViTTjyoP9+9+KhKh2FmZeSWgpmZFTgpmJlZgZOCmZkVuE/BAHhh8VqeKHraaPXGrRWMxswqxUnBALjtj3/jzy8u26VsuJ/9N+twnBSqwGMvLmP5+i0tV2zG4tWbOGJIHx76xImFsrpOvrto1tE4KbRzKzds4Z8mTGuVY532toFOBGYdXK5JQdI44DtALfDfEfFvjbYPBX4M9E3rXBsRk/KMqdrsHLHzS+MO4bwjB+/RsQb07NIaIZlZO5ZbUpBUC9wBnA4sAqZJmhgRs4uqfQV4ICK+L2k0MAkYlldM1axf9zoG9+3WckUzs2bkea/gWGBuRMyLiK3AfcB5jeoE0Dtd7gO8lmM8ZmbWgjyTwmBgYdH6orSs2I3AJZIWkbQSPl3qQJKukDRd0vTly9vOIG1mZtWm0h3NFwMTIuJWSScA90g6LCJ2FFeKiPHAeIAxY8Y0PTFuldqybTur6ku/N7Bs3Z49dWRmVizPpLAYKJ65fEhaVuxyYBxARDwpqSvQH1iGFVx29zSenLey2Tqda/3UkJntuTyTwjRgpKThJMngIuDDjeq8CpwKTJD0NqAr4PtDqYhg+45g2frNHD64Dx85bmjJep1razjz8EFljs7MqlFuSSEitkm6CniE5HHTuyNilqSbgOkRMRH4AvADSZ8j6XS+LCI63O2hpnzpoZk8OGMRAOe8fT8uOrZ0UjAzay259imk7xxMalR2Q9HybOCdecbQns1bUc/QvbrzwXcM4YxD3RIws/xVuqPZSvjuo3/nh1Pms37zNk48cG8+ferISodkZh2Ek0Ib9NzCNdRIXHLcULcQzKysnBTaqP36duVr5x1W6TDMrIPxc4xmZlbglkIb8sispUyYsoA5
S9cxpJ/HMTKz8nNLoQ155IWlzHh1NQcP7MX5ezjiqZnZW+GWQhvw7KurmfCXBcx4ZTX79O7CA584odIhmVkH5ZZCG/Dr515j4vOv0bm2hlMOGVjpcMysA3NLoQKWrt3Mj6bML0yQ89T8VfTq0onHvji2soGZWYfnpFABD7+whLsmz6NHXS01EgBHH9CvwlGZmWVICpIEfAQYERE3pVNoDoqIp3OPrkpsbtjOfU+/yqaGZETwGa+sAmDKtafQt3tdJUMzM9tFlpbCfwE7gFOAm4D1wC+AY3KMq6pMnbeSG38ze5eyAb260K2utkIRmZmVliUpHBcRR0t6FiAiVkvyv7cZ7NgR/GH2Up6an7QMHvzECRw+uA8AnWpEJ8+BYGZtTJak0CCplmRoayQNIGk5WAtmvbaOT/z0GQBqBIN6d6VrZ7cOzKztypIUbgd+CQyU9E3gA8C/5hpVFXhu4RqmL0haCLd+8O2cPGoge/VwA8vM2rYWk0JE/EzSDJIZ0gScHxFzco+sHVu8ZhPn3zGlsD5iQA8nBDNrF7I8fXRPRHwUeLFEmZWwaes2AL5w+sGcceggDt6nZ4UjMjPLJsvto0OLV9L+hXfkE051Gda/B4cM6lXpMMzMMmvy8RdJ10laDxwhaZ2k9en6MuDXZYuwnanfso2VG7ZWOgwzs7ekyZZCRNwM3Czp5oi4rowxtVv1W7Zx7Df/RP3W7QB09iOnZtbOZOlovk5SP2Ak0LWofHKegbVVEUFE6W0btmyjfut2zn37fpx08ADGHjKgvMGZme2hLB3NHwOuBoYAzwHHA0+SvOHc4Vx57zNM+uvSZuuccODevP8dQ8oUkZlZ68nS0Xw1yZAWUyPiZEmjgP+Xb1ht18vL6jloYE/OOWK/kts7dxLjDh1U5qjMzFpHlqSwOSI2S0JSl4h4UdIhuUfWhh00oCdXnzay0mGYmbW6LElhkaS+wK+AP0paDbySb1hmZlYJWTqaL0gXb5T0GNAHeDjXqNqgO594mV8+s5j5K+oZ3r9HpcMxM8tFs0khfVFtVkSMAoiIJ8oSVRv05xeXsXzDFk4ZNZAPjnEnsplVp2aTQkRsl/SSpKER8Wq5gmqrDt6nJ3d+1C9zm1n1ytKn0A+YJelpoH5nYUScm1tUZmZWEVmSgofJNjPrILJ0NHfYfgQzs47Gg/OYmVlBrklB0ri0o3qupGubqHOhpNmSZkm6N894zMyseVn6FJDUDRgaES9lPXD6OOsdwOnAImCapIkRMbuozkjgOuCdEbFa0sDdij4Hi9ds4nczX3vToHdL1m5icN9ulQnKzKxMsgyIdw7wH0AdMFzSkcBNGZ4+OhaYGxHz0uPcB5wHzC6q83HgjohYDRARy3b/ElrXT/6ygLsmzyu57YQRe5c5GjOz8srSUriR5AP+cYCIeE7S8Az7DQYWFq0vAo5rVOdgAElTgFrgxoio6NvS23YEPepqmfaV0960rVvn2gpEZGZWPlmSQkNErJVUXNbEjAJv6fwjgbEkQ3NPlnR4RKwpriTpCuAKgKFDh7bSqZsmie51me6smZlVlSwdzbMkfRiolTRS0neBv2TYbzGwf9H6kLSs2CJgYkQ0RMR84G8kSWIXETE+IsZExJgBA/KZuGZzw3amzF3BotUbczm+mVl7kCUpfBo4FNgC3AusBT6bYb9pwEhJwyXVARcBExvV+RVJKwFJ/UluJ5W+oZ+zn059hY/891M8Mut1enZxK8HMOqYsn36jIuJ64PrdOXBEbJN0FfAISX/B3RExS9JNwPSImJhuO0PSbGA7cE1ErNy9S8hu4aqNbNiyreS2V1clLYT7rzieoXt3zysEM7M2TdHUhMM7KyTDZQ8CHgLuj4gXyhFYU8aMGRPTp0/f7f1eXr6BU29t/uXsuk41zLlpHLU1araemVl7I2lGRIxpqV6WYS5OljQIuBC4S1JvkuTwjVaIs2zWbWoA4KqTD+Kwwb1L1hnct7sTgpl1aJlunkfEUuD2
tNXwJeAGoF0lhZ3eMawfJx9S8XfkzMzapBY7miW9TdKNkv4K7HzyyLPMmJlVoSwthbuB+4H3RsRrOcfT6tZubODkWx9nVf1WAGrl20NmZk3J0qdwQjkCycvK+i2sqt/Kew/dh6OG9uOYYXtVOiQzszaryaQg6YGIuDC9bVT8iJKAiIgjco+uFZ11+L6cd+TgSodhZtamNddSuDr9fnY5AjEzs8prMilExJJ08VMR8eXibZK+BXz5zXu1Hdu27+CKe2awYGV9y5XNzAzINszF6SXKzmztQFrb2k0N/PnFZdTV1nD2Efu6L8HMLIPm+hQ+CXwKGCFpZtGmXsCUvANrLR8+biiXnjCs0mGYmbULzfUp3Av8HrgZKJ5Kc31ErMo1KjMzq4jmkkJExAJJVzbeIGkvJwYzs+rTUkvhbGAGySOpxW99BTAix7jMzKwCmnv66Oz0e5apN83MrApkGfvonZJ6pMuXSLpNUv5zYpqZWdlleST1+8BGSW8HvgC8DNyTa1R76A+zlvLTqa9WOgwzs3Yny4B42yIiJJ0HfC8ifijp8rwDe6u27wg+8dMZ7AiQYL8+3SodkplZu5ElKayXdB3wUeDdkmqAzvmGtWd2BFx58oFcefJBdK/zfMtmZllluX30IWAL8M/pZDtDgFtyjaoVdOlU64RgZrabsgydvVTSz4BjJJ0NPB0RP8k/tOwWr9nEghXJGEfbdzQ/57SZmTWtxaQg6UKSlsHjJO8qfFfSNRHxUM6xZXbZ3U/z92Ubdinr2cWtBDOz3ZXlk/N64JiIWAYgaQDwJ6DNJIWNW7dz0sEDuPLkgwCorYEjhvStcFRmZu1PlqRQszMhpFaSrS+irAb06sKxwz0SqpnZnsiSFB6W9Ajw83T9Q8Ck/EIyM7NKydLRfI2kfwDelRaNj4hf5huWmZlVQtbe2L8A24EdwLT8wjEzs0rKMvbRx4CngQuADwBTJf1z3oGZmVn5ZWkpXAMcFRErASTtTdJyuDvPwMzMrPyyPEW0ElhftL4+LTMzsyqTpaUwF3hK0q9JJtc5D5gp6fMAEXFbjvGZmVkZZUkKL6dfO/06/d6r9cMxM7NKyvJI6tfKEYiZmVVem3sz2czMKifXpCBpnKSXJM2VdG0z9d4vKSSNyTMeMzNrXm5DiUqqBe4ATgcWAdMkTYyI2Y3q9QKuBp7a3XNMfP41Jv9tOavqt7ZGyGZmHV6Wl9cOlvSopBfS9SMkfSXDsY8F5kbEvIjYCtxH8uRSY18HvgVs3o24Abjz8Zf5zfOvsVePOo4Z1m93dzczs0ay3D76AXAd0AAQETOBizLsNxhYWLS+KC0rkHQ0sH9E/K65A0m6QtJ0SdOXL1++y7Z3jxzAlGtP4UPHDM0QkpmZNSdLUugeEU83Ktu2pydO53q+DfhCS3UjYnxEjImIMQMGDNjTU5uZWROyJIUVkg4keXENSR8AlmTYbzGwf9H6kLRsp17AYcDjkhYAxwMT3dlsZlY5WTqarwTGA6MkLQbmA5dk2G8aMFLScJJkcBHw4Z0bI2It0H/nuqTHgS9GxPTM0ZuZWavK8vLaPOA0ST1IZmFb39I+6X7bJF0FPALUAndHxCxJNwHTI2LingRuZmatr8WkIOmGRusARMRNLe0bEZNoNEtbRNzQRN2xLR3PzMzyleX2UX3RclfgbGBOPuGYmVklZbl9dGvxuqT/ILklZGZmVeatDHPRneRJIjMzqzJZ+hT+Svo4KkmH8QCgxf4EMzNrf7L0KZxdtLwNeD0i9vjlNTMza3uaTQrpoHaPRMSoMsVjZmYV1GyfQkRsB16S5IGFzMw6gCy3j/oBsyQ9TdHjqRFxbm5RmZlZRWRJCv+aexRmZtYmZEkKZ0XEl4sLJH0LeCKfkMzMrFKyvKdweomyM1s7EDMzq7wmWwqSPgl8ChghaWbRpl7AlLwDMzOz8mvu9tG9wO+Bm4Fri8rXR8SqXKMyM7OKaDIppPMdrAUuLl84ZmZWSW9l7CMz
M6tSTgpmZlbgpGBmZgVOCmZmVuCkYGZmBVneaG5zXly6jidfXsnK+i3s17dbpcMxM6sa7TIpfPN3c/j/f18BwLhDu1Y4GjOz6tEuk8K27cFRQ/vyo8uOoU+3zpUOx8ysarTLpADQuaaGvt3rKh2GmVlVcUezmZkVOCmYmVmBk4KZmRU4KZiZWYGTgpmZFTgpmJlZgZOCmZkVOCmYmVmBk4KZmRXkmhQkjZP0kqS5kq4tsf3zkmZLminpUUkH5BmPmZk1L7ekIKkWuAM4ExgNXCxpdKNqzwJjIuII4CHg3/OKx8zMWpZnS+FYYG5EzIuIrcB9wHnFFSLisYjYmK5OBYbkGI+ZmbUgz6QwGFhYtL4oLWvK5cDvS22QdIWk6ZKmL1++vBVDNDOzYm2io1nSJcAY4JZS2yNifESMiYgxAwYMKG9wZmYdSJ5DZy8G9i9aH5KW7ULSacD1wEkRsSXHeMzMrAV5thSmASMlDZdUB1wETCyuIOko4C7g3IhYlmMsZmaWQW5JISK2AVcBjwBzgAciYpakmySdm1a7BegJPCjpOUkTmzicmZmVQa4zr0XEJGBSo7IbipZPy/P8Zma2e9pER7OZmbUNTgpmZlbgpGBmZgVOCmZmVuCkYGZmBU4KZmZW4KRgZmYFTgpmZlbgpGBmZgVOCmZmVuCkYGZmBU4KZmZW4KRgZmYFTgpmZlbgpGBmZgVOCmZmVuCkYGZmBU4KZmZW4KRgZmYFTgpmZlbgpGBmZgVOCmZmVuCkYGZmBU4KZmZW4KRgZmYFTgpmZlbQ7pLCuk0NrKrfWukwzMyqUrtLCq+s2shLr6+nd7fOlQ7FzKzqdKp0ALurrraGSZ95N8P6d690KGZmVafdJYUaidH79a50GGZmVand3T4yM7P8OCmYmVmBk4KZmRXkmhQkjZP0kqS5kq4tsb2LpPvT7U9JGpZnPGZm1rzckoKkWuAO4ExgNHCxpNGNql0OrI6Ig4D/BL6VVzxmZtayPFsKxwJzI2JeRGwF7gPOa1TnPODH6fJDwKmSlGNMZmbWjDwfSR0MLCxaXwQc11SdiNgmaS2wN7CiuJKkK4Ar0tUtkl7IJeL2oT+Nfj4dTEe+/o587eDr39PrPyBLpXbxnkJEjAfGA0iaHhFjKhxSxfj6O+71d+RrB19/ua4/z9tHi4H9i9aHpGUl60jqBPQBVuYYk5mZNSPPpDANGClpuKQ64CJgYqM6E4F/TJc/APw5IiLHmMzMrBm53T5K+wiuAh4BaoG7I2KWpJuA6RExEfghcI+kucAqksTRkvF5xdxO+Po7ro587eDrL8v1y/+Ym5nZTn6j2czMCpwUzMysoM0mhY4+REaG6/+8pNmSZkp6VFKmZ5Dbg5auvaje+yWFpKp6TDHL9Uu6MP39z5J0b7ljzFOGv/2hkh6T9Gz6939WJeLMg6S7JS1r6l0sJW5PfzYzJR3d6kFERJv7IumYfhkYAdQBzwOjG9X5FHBnunwRcH+l4y7z9Z8MdE+XP1kt15/l2tN6vYDJwFRgTKXjLvPvfiTwLNAvXR9Y6bjLfP3jgU+my6OBBZWOuxWv/z3A0cALTWw/C/g9IOB44KnWjqGtthQ6+hAZLV5/RDwWERvT1akk74FUgyy/e4Cvk4yVtbmcwZVBluv/OHBHRKwGiIhlZY4xT1muP4CdM231AV4rY3y5iojJJE9iNuU84CeRmAr0lbRva8bQVpNCqSEyBjdVJyK2ATuHyKgGWa6/2OUk/z1UgxavPW0y7x8RvytnYGWS5Xd/MHCwpCmSpkoaV7bo8pfl+m8ELpG0CJgEfLo8obUJu/vZsNvaxTAX1jRJlwBjgJMqHUs5SKoBbgMuq3AoldSJ5BbSWJIW4mRJh0fEmopGVT4XAxMi4lZJJ5C863RYROyodGDVoK22FDr6EBlZrh9JpwHXA+dGxJYyxZa3lq69F3AY8LikBST3VSdWUWdzlt/9ImBiRDRExHzgbyRJohpkuf7LgQcAIuJJ
oCvJYHEdQabPhj3RVpNCRx8io8Xrl3QUcBdJQqime8rNXntErI2I/hExLCKGkfSnnBsR0ysTbqvL8rf/K5JWApL6k9xOmlfOIHOU5fpfBU4FkPQ2kqSwvKxRVs5E4NL0KaTjgbURsaQ1T9Ambx9FfkNktAsZr/8WoCfwYNq//mpEnFuxoFtJxmuvWhmv/xHgDEmzge3ANRFRFa3kjNf/BeAHkj5H0ul8WbX8Qyjp5yQJv3/aZ/JVoDNARNxJ0odyFjAX2Aj8U6vHUCU/SzMzawVt9faRmZlVgJOCmZkVOCmYmVmBk4KZmRU4KZiZWYGTgrVpkj4jaY6knzVTZ6yk35YzrqZIOnfnyJ6Szpc0umjbTekLh+WKZaykE8t1PqsObfI9BbMinwJOi4hFlQ4ki/Q5+p3vUpwP/BaYnW67obXPJ6lTOvZXKWOBDcBfWvu8Vr3cUrA2S9KdJEMo/17S5yQdK+nJdBz9v0g6pMQ+J0l6Lv16VlKvtPwaSdPSMei/1sT5Nkj6z3SOgkclDUjLj0wHnpsp6ZeS+qXln9Ebc1rcl5ZdJul76X/o5wK3pLEcKGmCpA+k8wU8WHTeQktH0hnpNT4j6UFJPUvE+bikb0uaDlwt6Rwlc4o8K+lPkvZRMr/IJ4DPped/t6QBkn6R/hymSXrnHvx6rFpVevxwf/mruS9gAdA/Xe4NdEqXTwN+kS6PBX6bLv8GeGe63JOkNXwGyRj8IvlH6LfAe0qcK4CPpMs3AN9Ll2cCJ6XLNwHfTpdfA7qky33T75cV7TcB+EDR8SeQDMnSiWSohh5p+feBS0jG75lcVP5l4IYScT4O/FfRej/eeBH1Y8Ct6fKNwBeL6t0LvCtdHgrMqfTv119t78u3j6w96QP8WNJIkg/wziXqTAFuS/sg/iciFkk6gyQxPJvW6UkygNzkRvvuAO5Pl38K/I+kPiQf+E+k5T8Gdv6XPxP4maRfkYxHlEkkQzk8DJwj6SHgfcCXSEa6HQ1MSYcuqQOebOIw9xctDwHuVzKufh0wv4l9TgNG641pR3pL6hkRG7LGbtXPScHak68Dj0XEBentkccbV4iIf5P0O5LxYaZIei9JC+HmiLhrN8/X0hgw7yOZKesc4HpJh+/Gse8DriLNKRRcAAABbElEQVQZt2t6RKxX8mn9x4i4OMP+9UXL3wVui4iJksaStBBKqQGOj4hqm5jIWpH7FKw96cMbwwRfVqqCpAMj4q8R8S2SETdHkQyu9s87789LGixpYInda0hu7wB8GPjfiFgLrJb07rT8o8ATSuZ12D8iHiO5zdOHpAVSbD3JUN+lPEEy7eLHSRIEJCO+vlPSQWmcPSQd3MT+xYp/Lv9YVN74/H+gaEIaSUdmOLZ1ME4K1p78O3CzpGdpupX7WUkvSJoJNAC/j4g/kNxPf1LSX0mmby31YV0PHKtk0vRTSPoPIPmgvSU95pFpeS3w0/R4zwK3x5snubkPuCbtAD6weENEbCfp2zgz/U5ELCdJdj9Pz/UkSVJryY0ko+XOAFYUlf8GuGBnRzPwGWBM2jE+m6Qj2mwXHiXVLCVpQ0S86Wkfs47ELQUzMytwS8HMzArcUjAzswInBTMzK3BSMDOzAicFMzMrcFIwM7OC/wOKsZX5j0YpnAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
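   ],
   "source": [
    "from sklearn.metrics import roc_curve\n",
    "from matplotlib import pyplot as plt\n",
    "\n",
    "# Compute false/true positive rates across all probability thresholds.\n",
    "fpr, tpr, _ = roc_curve(y_eval, probs)\n",
    "plt.plot(fpr, tpr)\n",
    "plt.title('ROC curve')\n",
    "plt.xlabel('false positive rate')\n",
    "plt.ylabel('true positive rate')\n",
    "# Anchor both axes at 0 and leave the upper limits as matplotlib chose them.\n",
    "plt.xlim(0,)\n",
    "plt.ylim(0,);"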
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "9HKAt75V3O8E"
   },
   "source": [
    "**???** What do the true positive rate and false positive rate refer to for this dataset?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright 2019 Google Inc. Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "ASL_b_boosted_trees_estimator",
   "provenance": [],
   "version": "0.3.2"
  },
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
